openai codex
ClusterFusion: Hybrid Clustering with Embedding Guidance and LLM Adaptation
Xu, Yiming, Yuan, Yuan, Viswanathan, Vijay, Neubig, Graham
Text clustering is a fundamental task in natural language processing, yet traditional clustering algorithms with pre-trained embeddings often struggle in domain-specific contexts without costly fine-tuning. Large language models (LLMs) provide strong contextual reasoning, yet prior work mainly uses them as auxiliary modules to refine embeddings or adjust cluster boundaries. We propose ClusterFusion, a hybrid framework that instead treats the LLM as the clustering core, guided by lightweight embedding methods. The framework proceeds in three stages: embedding-guided subset partition, LLM-driven topic summarization, and LLM-based topic assignment. This design enables direct incorporation of domain knowledge and user preferences, fully leveraging the contextual adaptability of LLMs. Experiments on three public benchmarks and two new domain-specific datasets demonstrate that ClusterFusion not only achieves state-of-the-art performance on standard tasks but also delivers substantial gains in specialized domains. To support future work, we release our newly constructed dataset and results on all benchmarks.
- North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
- North America > United States > California (0.04)
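The three stages named in the ClusterFusion abstract above (embedding-guided subset partition, LLM-driven topic summarization, LLM-based topic assignment) can be illustrated with a minimal sketch. This is not the authors' implementation: the bag-of-words "embedding", the seed-based partition, and the `stub_summarize` / `stub_assign` functions are hypothetical stand-ins chosen so the example runs without any model or API. In practice the two stub functions would be replaced by real LLM calls.

```python
from collections import Counter
from typing import Callable


def embed(text: str) -> Counter:
    """Toy bag-of-words vector (assumption: stands in for a real embedding model)."""
    return Counter(text.lower().split())


def similarity(a: Counter, b: Counter) -> int:
    """Word overlap between two bag-of-words vectors."""
    return sum((a & b).values())


def partition(texts: list[str], seeds: list[str]) -> list[list[str]]:
    """Stage 1 (sketch): place each text in the subset of its most similar seed."""
    seed_vecs = [embed(s) for s in seeds]
    subsets: list[list[str]] = [[] for _ in seeds]
    for text in texts:
        vec = embed(text)
        best = max(range(len(seeds)), key=lambda i: similarity(vec, seed_vecs[i]))
        subsets[best].append(text)
    return subsets


def stub_summarize(subset: list[str]) -> str:
    """Stage 2 stand-in for an LLM call: label a subset by its most frequent word."""
    counts = Counter(word for text in subset for word in text.lower().split())
    return counts.most_common(1)[0][0]


def stub_assign(text: str, topics: list[str]) -> str:
    """Stage 3 stand-in for an LLM call: first topic mentioned in the text."""
    words = set(text.lower().split())
    for topic in topics:
        if topic in words:
            return topic
    return topics[0]  # fallback when no topic word appears


def cluster(
    texts: list[str],
    seeds: list[str],
    summarize: Callable[[list[str]], str] = stub_summarize,
    assign: Callable[[str, list[str]], str] = stub_assign,
) -> dict[str, str]:
    """Run the three stages in order and return a text -> topic mapping."""
    subsets = [s for s in partition(texts, seeds) if s]
    topics = [summarize(s) for s in subsets]
    return {text: assign(text, topics) for text in texts}


texts = [
    "refund my order",
    "order arrived late",
    "reset my password",
    "password login fails",
]
result = cluster(texts, seeds=[texts[0], texts[2]])
```

Passing `summarize` and `assign` as parameters keeps the LLM-facing steps swappable, so domain knowledge or user preferences could be injected through the prompt of whatever model replaces the stubs.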
The Rise of AI Teammates in Software Engineering (SE) 3.0: How Autonomous Coding Agents Are Reshaping Software Engineering
Li, Hao, Zhang, Haoxiang, Hassan, Ahmed E.
The future of software engineering--SE 3.0--is unfolding with the rise of AI teammates: autonomous, goal-driven systems collaborating with human developers. Among these, autonomous coding agents are especially transformative, now actively initiating, reviewing, and evolving code at scale. This paper introduces AIDev, the first large-scale dataset capturing how such agents operate in the wild. Spanning over 456,000 pull requests by five leading agents--OpenAI Codex, Devin, GitHub Copilot, Cursor, and Claude Code--across 61,000 repositories and 47,000 developers, AIDev provides an unprecedented empirical foundation for studying autonomous teammates in software development. Unlike prior work that has largely theorized the rise of AI-native software engineering, AIDev offers structured, open data to support research in benchmarking, agent readiness, optimization, collaboration modeling, and AI governance. The dataset includes rich metadata on PRs, authorship, review timelines, code changes, and integration outcomes--enabling exploration beyond synthetic benchmarks like SWE-bench. For instance, although agents often outperform humans in speed, their PRs are accepted less frequently, revealing a trust and utility gap. Furthermore, while agents accelerate code submission--one developer submitted as many PRs in three days as they had in three years--these are structurally simpler (via code complexity metrics). We envision AIDev as a living resource: extensible, analyzable, and ready for the SE and AI communities. Grounding SE 3.0 in real-world evidence, AIDev enables a new generation of research into AI-native workflows and supports building the next wave of symbiotic human-AI collaboration. The dataset is publicly available at https://github.com/SAILResearch/AI_Teammates_in_SE3. Keywords: AI Agent, Agentic AI, Coding Agent, Agentic Coding, Software Engineering Agent
- North America > Canada > Ontario > Kingston (0.04)
- North America > United States > New York > New York County > New York City (0.04)
- North America > United States > New Jersey > Hudson County > Hoboken (0.04)
Evaluation of OpenAI Codex for HPC Parallel Programming Models Kernel Generation
Godoy, William F., Valero-Lara, Pedro, Teranishi, Keita, Balaprakash, Prasanna, Vetter, Jeffrey S.
We evaluate AI-assisted generative capabilities on fundamental numerical kernels in high-performance computing (HPC), including AXPY, GEMV, GEMM, SpMV, Jacobi Stencil, and CG. We test the generated kernel codes for a variety of language-supported programming models, including (1) C++ (e.g., OpenMP [including offload], OpenACC, Kokkos, SyCL, CUDA, and HIP), (2) Fortran (e.g., OpenMP [including offload] and OpenACC), (3) Python (e.g., numpy, Numba, cuPy, and pyCUDA), and (4) Julia (e.g., Threads, CUDA.jl, AMDGPU.jl, and KernelAbstractions.jl). We use the GitHub Copilot capabilities powered by OpenAI Codex available in Visual Studio Code as of April 2023 to generate a vast amount of implementations given simple
The Robots Are Coming: The Rise of OpenAI and ChatGPT - CTaccess
According to the New York Times, Google became the world's largest search engine in 2000, when it reached the point of indexing over 1 billion web pages. Since then, Google has become a household word. We "Google it" to learn anything we need, want, or are just curious about. Google has reigned supreme in the search engine space for the last 20 years. Interestingly, recent developments in artificial intelligence (AI) with the OpenAI and ChatGPT engine are causing many to question whether the rule of Google will end.
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.86)
GitHub Copilot - Wikipedia
GitHub Copilot is a cloud-based artificial intelligence tool developed by GitHub and OpenAI to assist users of Visual Studio Code, Visual Studio, Neovim, and JetBrains integrated development environments (IDEs) by autocompleting code.[1] Currently available by subscription to individual developers, the tool was first announced by GitHub on 29 June 2021, and works best for users coding in Python, JavaScript, TypeScript, Ruby, and Go.[2] On 29 June 2021, GitHub announced GitHub Copilot for technical preview in the Visual Studio Code development environment.[1][3] On 26 October 2021, GitHub Copilot was released as a plugin on the JetBrains marketplace.[4] On 27 October 2021, GitHub released the GitHub Copilot Neovim plugin as a public repository.[5] On 29 March 2022, GitHub officially announced Copilot's availability for the Visual Studio 2022 IDE.[6]
OpenAI Codex -- My Trials and Tribulations
Last year, OpenAI announced Codex, a model for efficient programming with the aid of Artificial Intelligence (AI). One of the videos uploaded to the OpenAI YouTube channel showed a live demo that was hard to believe even when seen with one's own eyes. With just a few lines of commands, it was possible to create a whole game in JavaScript. The level of the commands seemed somewhat high, but with Codex you can see that it is immediately able to implement the code and run the game. In this way, Codex is a model that helps people write code much more efficiently than they could on their own.
- Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.95)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.84)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.84)
MIT Researchers use OpenAI Codex to Build an ML-based Mathematics Problem-generator
OpenAI Codex is one of the most powerful language-to-code, GPT-3-based neural network platforms for high-speed programming. OpenAI Codex is used in a large number of AI and machine learning projects in a safe AGI environment. As the demand for Codex programmers increases, a large number of AI researchers are also turning to OpenAI's GPT-3 offering to improve their understanding of neural networks for complex problems. In one such development, a group of machine learning researchers and faculty members from MIT, Columbia University, Harvard University, and the University of Waterloo have built a machine learning algorithm using OpenAI Codex. This new algorithm can solve, explain, and generate complex mathematical problems.
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (1.00)
Sourceless presents the first Cognitive Web
Formwelt, OpenAI Codex, GitHub Copilot and other Artificial Intelligence projects will make the SourceLess Platform usable by absolutely anyone, who will be able to create anything just by using words (written or spoken). For example, by using the Formwelt language, anyone, regardless of nationality, can communicate in a direct and semantically correct way with OpenAI Codex and create anything in the digital world; you can create a complete and complex website in less than an hour. All these AI systems will be implemented inside the SourceLess Platform, so everyone can have access to all the facilities of the new Web through a single domain (e.g., str.domain). Education, Technology & Innovation -- these three pillars of the future are the foundations of the SourceLess Platform. The purpose of education in the SourceLess project is to transmit knowledge or foster skills and character traits. These aims may include the development of understanding, rationality, kindness, and honesty.
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.91)
- Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.70)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.55)